Improving exploration in reinforcement learning through domain knowledge and parameter analysis

Author

  • Marek Grzes
Abstract

This thesis presents novel work on how to improve exploration in reinforcement learning using domain knowledge and knowledge-based approaches to reinforcement learning. It also identifies novel relationships between the algorithms' and domains' parameters and the exploration efficiency. The goal of solving reinforcement learning problems is to learn how to execute actions in order to maximise the long-term reward. Solving problems of this type is hard when real domains of realistic size are considered, because the state space grows exponentially with each state feature added to the representation of the problem. In its basic form, reinforcement learning is tabula rasa, i.e. it starts learning with very limited knowledge about the domain. One of the ways of improving the performance of reinforcement learning is the principled use of domain knowledge. Knowledge is successful in related branches of artificial intelligence, and it is becoming increasingly important in the area of reinforcement learning as well. Reinforcement learning algorithms normally face the problem of deciding whether to execute explorative or exploitative actions, and the paramount goal is to limit the number of executions of suboptimal explorative actions. In this thesis, it is shown how domain knowledge and understanding of algorithms' and domains' properties can help to achieve this. Exploration is an immensely complicated process in reinforcement learning and is influenced by numerous factors. This thesis presents a new range of methods for dealing more efficiently with the exploration-exploitation dilemma, which is a crucial issue when applying reinforcement learning in practice. Reward shaping was used in this research as a well-established framework for incorporating procedural knowledge into model-free reinforcement learning.
Two new ways of obtaining heuristics for potential-based shaping were introduced and evaluated: high-level symbolic knowledge and the application of different hypothesis spaces to learn the heuristic.
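
As a rough illustration of the potential-based shaping mentioned above, the sketch below adds the standard shaping term F(s, s') = gamma * Phi(s') - Phi(s) to the environment reward inside a tabular Q-learning update. It is not taken from the thesis: the choice of Q-learning, the function names shaping_term and shaped_q_update, the list-based Q-table, and the example potential are all assumptions made for illustration only.

```python
def shaping_term(phi_s, phi_next, gamma):
    """Potential-based shaping reward: gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def shaped_q_update(q, s, a, reward, s_next, done, phi, gamma=0.95, alpha=0.5):
    """One tabular Q-learning update with a shaped reward (illustrative sketch).

    q   : list of per-state lists of action values (hypothetical layout)
    phi : list of heuristic potentials, one per state (assumed to be given)
    """
    # The potential of a terminal state is treated as zero.
    phi_next = 0.0 if done else phi[s_next]
    f = shaping_term(phi[s], phi_next, gamma)
    # No bootstrapping from a terminal state.
    bootstrap = 0.0 if done else gamma * max(q[s_next])
    target = reward + f + bootstrap
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


# Hypothetical 3-state chain; the potential is the negative distance to the goal.
phi = [-2.0, -1.0, 0.0]
q = [[0.0, 0.0] for _ in phi]
new_value = shaped_q_update(q, s=0, a=1, reward=0.0, s_next=1, done=False,
                            phi=phi, gamma=1.0, alpha=0.5)
```

In this example the step towards the goal earns a positive shaping reward even though the environment reward is zero, which is how a heuristic potential can steer exploration. How the potential itself is obtained (from symbolic knowledge or by learning it) is the subject of the thesis and is not modelled here.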

Similar resources

A Study of Qualitative Knowledge-Based Exploration for Continuous Deep Reinforcement Learning

As an important method to solve sequential decision-making problems, reinforcement learning learns the policy of tasks through interaction with the environment. But it has difficulties scaling to large-scale problems. One of the reasons is the exploration and exploitation dilemma, which may lead to inefficient learning. We present an approach that addresses this shortcoming by introducing qualitat...

Exploration/exploitation in Adaptive Recommender Systems

Interactive information systems are often designed on the basis of little knowledge about users' goals and about the final content of the information base. In addition, users vary widely in their interests. This makes it useful to give such systems the ability to dynamically adapt to their users. Here we focus on "recommending" systems that help a user navigate through the information system. In pa...

Exploring parameter space in reinforcement learning

This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-ex...

Feature Engineering for Predictive Modeling using Reinforcement Learning

Feature engineering is a crucial step in the process of predictive modeling. It involves the transformation of a given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no well-defined basis for performing effective feature engineering. It involves domain knowledge, intuition, and most of all, a lengthy p...

Publication date: 2010